Butterfly Mixing: Accelerating Incremental-Update Algorithms on Clusters

Authors

  • John F. Canny
  • Huasha Zhao
Abstract

Incremental model-update strategies are widely used in machine learning and data mining. By “incremental update” we refer to models that are updated many times using small subsets of the training data. Two well-known examples are stochastic gradient methods and MCMC. Both provide fast sequential performance and have produced many of the best-performing methods for particular problems (logistic regression, SVM, LDA, etc.). However, these methods are difficult to adapt to parallel or cluster settings because of the overhead of distributing model updates through the network. Updates can be batched locally to reduce communication overhead, but convergence typically suffers as the batch size increases. In this paper we introduce and analyze butterfly mixing, an approach that interleaves communication with computation. We evaluate butterfly mixing on stochastic gradient algorithms for logistic regression and SVM, on two datasets. Results show that butterfly mix steps are fast and failure-tolerant, and overall we achieved a 3.3x speedup over full mix (AllReduce) on an Amazon EC2 cluster.
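The abstract does not spell out the mixing schedule, so the following is only a minimal sketch of one plausible reading of "interleaving communication with computation". It assumes a power-of-two number of simulated nodes, a local update on each node followed by a single stage of a butterfly exchange, and a partner chosen by flipping one bit of the node id. The names `butterfly_partner`, `butterfly_mix_step`, `run_butterfly_mixing`, and `local_update` are hypothetical and do not come from the paper.

```python
def butterfly_partner(node, stage):
    """Assumed pairing rule: flip bit `stage` of the node id."""
    return node ^ (1 << stage)


def butterfly_mix_step(models, stage):
    """Average each node's model with its partner's model for one stage.

    `models` is a list with one parameter vector (list of floats) per node.
    """
    mixed = []
    for node, model in enumerate(models):
        partner_model = models[butterfly_partner(node, stage)]
        mixed.append([(a + b) / 2.0 for a, b in zip(model, partner_model)])
    return mixed


def run_butterfly_mixing(models, local_update, num_iterations):
    """Alternate one local update with one butterfly stage per iteration.

    `local_update(node, model, t)` is a caller-supplied (hypothetical) function,
    for example one stochastic gradient step on that node's mini-batch.
    """
    num_nodes = len(models)
    num_stages = num_nodes.bit_length() - 1
    if num_nodes < 2 or (1 << num_stages) != num_nodes:
        raise ValueError("this sketch assumes a power-of-two node count >= 2")
    for t in range(num_iterations):
        # Computation: every node updates its own copy of the model.
        models = [local_update(node, m, t) for node, m in enumerate(models)]
        # Communication: only one butterfly stage, cycling through the stages.
        models = butterfly_mix_step(models, t % num_stages)
    return models


# With a no-op update, log2(N) consecutive stages give every node the global mean.
result = run_butterfly_mixing(
    [[0.0], [4.0], [8.0], [12.0]], lambda node, model, t: model, 2
)
```

In this sketch each iteration exchanges data with a single partner rather than completing a full AllReduce, which is the trade-off the abstract contrasts with "full mix".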

Publication date: 2013